Abstract

Regular ΣX-forests continue to play an important role in programming
languages, specifically in the design of type systems [MiR85, AM91,
Vol93]. They arise naturally as terms of constructor-based, recursive
data types in logic and functional languages. Deciding whether the
intersection of a sequence of regular ΣX-forests is nonempty is an
important problem in type inference. We show that this problem is
PSPACE-hard and, as a corollary, that the problem of constructing a
regular ΣX-grammar representing their intersection is PSPACE-hard.



1 Introduction 

Regular ΣX-forests are playing an increasingly important role in language
design and in particular in the design of type systems. Type inference then
usually relies upon various operations over regular forests, one of which is
RF-INT, deciding the emptiness of their intersection.

Definition 1.1 The problem RF-INT is: given a sequence of regular
ΣX-grammars $G_1, \ldots, G_m$, decide whether $\bigcap_{i=1}^{m} T(G_i)$
is nonempty.

Regular forests have been used to characterize the types of logic and
functional programs [Mis84, MiR85, HeJ90, AM91] as well as overloadings
introduced through classes in Haskell [Kae88, Vol93]. For example, Heintze and
Jaffar propose what amounts to regular ΣX-grammars as inferred “types”
or approximations of the semantics of logic programs. Corresponding to a
logic program, say

    p(a).
    p(f(X)) :- p(X).
    r(b).
    r(f(Y)) :- r(Y).
    q(Z) :- p(Z), r(Z).

is a set of equations 

    $X = a \cup f(X)$
    $Y = b \cup f(Y)$
    $Z = X \cap Y$

whose simultaneous least fixed point is an approximate meaning of the pro- 
gram. The inferred approximation or “type” is given by 

    $X = a \cup f(X)$
    $Y = b \cup f(Y)$
    $Z = \emptyset$

Solving for variable Z requires deciding whether the intersection of the two 
regular forests described by the first two equations is nonempty. 
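
To make the fixed-point reading concrete, the following Python sketch iterates the equations for X and Y a bounded number of times and intersects the results. The term encoding (nested tuples) and the names `step` and `approximate` are illustrative assumptions, and a bounded iteration only suggests, rather than decides, that Z is empty.

```python
# Terms are nested tuples: ("a",), ("b",), or ("f", t) for a term t.

def step(const, terms):
    """One application of the equation T = const ∪ f(T)."""
    return {(const,)} | {("f", t) for t in terms}


def approximate(const, rounds):
    """Kleene iteration from the empty set, applied `rounds` times."""
    terms = set()
    for _ in range(rounds):
        terms = step(const, terms)
    return terms


X = approximate("a", 5)  # a, f(a), f(f(a)), ...
Y = approximate("b", 5)  # b, f(b), f(f(b)), ...
Z = X & Y                # bounded approximation of X ∩ Y
```

Every term in the approximation of X ends in the constant a and every term in that of Y ends in b, so the two sets share no element at any bound.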

One can also view the logic program above as describing a set of valid
overloadings in Haskell for p and r as operators where p has instances at
types a and f, and r at b and f:

    class P α where p :: α
    instance P a where p = ...
    instance P X => P f(X) where p = ...

    class R α where r :: α
    instance R b where r = ...
    instance R Y => R f(Y) where r = ...

Instance declarations for an overloaded operator in Haskell describe a regular
forest. So for example, deciding whether term p = r is typable requires
deciding whether the regular forest arising from p's instance declarations
intersects with the forest described by instances for r.

2 Forests and Regular ΣX-grammars

Given an alphabet $A$, an $A$-valued tree $t$ is specified by its set of nodes
(the “domain” $dom(t)$) and a valuation of the nodes in $A$. Formally, a
$k$-ary, $A$-valued tree is a map $t : dom(t) \to A$ where
$dom(t) \subseteq \{0, \ldots, k-1\}^*$ is a nonempty set, closed under
prefixes. The frontier of $t$ is the set

    $\{ w \in dom(t) \mid \neg\exists i.\ wi \in dom(t) \}$.
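
As an illustration of these definitions, the sketch below encodes a tree as a Python dictionary from node addresses (tuples over 0..k-1) to labels and computes its frontier. The function names `is_tree_domain` and `frontier` and the label `"sigma"` are hypothetical choices made for this example.

```python
def is_tree_domain(dom, k):
    """Check that dom is a nonempty, prefix-closed set of words over {0..k-1}."""
    if not dom:
        return False
    for w in dom:
        if any(not (0 <= i < k) for i in w):
            return False
        # Checking the immediate prefix of every word gives prefix closure.
        if w and w[:-1] not in dom:
            return False
    return True


def frontier(dom, k):
    """Nodes w of dom that have no child wi in dom."""
    return {w for w in dom if not any(w + (i,) in dom for i in range(k))}


# The binary tree sigma(x, y): root at (), children at (0,) and (1,).
t = {(): "sigma", (0,): "x", (1,): "y"}
leaves = frontier(set(t), 2)
```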

It is assumed that $A$ is partitioned into a ranked alphabet $\Sigma$ and a
frontier alphabet $X$. A ranked alphabet, or signature, is a finite nonempty
operator domain. For any $\Sigma$ and $X$, we denote the set of all finite
ΣX-trees by $F_\Sigma(X)$. A forest, or tree language, $T \subseteq F_\Sigma(X)$
is called regular if and only if for some finite set $C$ disjoint from
$\Sigma$ and $X$, $T$ can be obtained from finite subsets of
$F_\Sigma(X \cup C)$ by applications of union, concatenation $\cdot_c$
(defined using tree substitution), and closure $^{*c}$, where $c \in C$ [Tho90].

A regular forest can alternatively be defined as a tree language generated
by a regular ΣX-grammar [GeS84].

Definition 2.1 A regular ΣX-grammar $G$ consists of

• a finite nonempty set $N$ of nonterminal symbols,

• a finite set $P$ of productions of the form $A \to r$ where $A \in N$ and
$r \in F_\Sigma(N \cup X)$, and

• an initial symbol $S \in N$.

Definition 2.2 If $G = (N, \Sigma, X, P, S)$ is a regular ΣX-grammar then the
ΣX-forest generated by $G$ is

    $T(G) = \{ t \in F_\Sigma(X) \mid S \Rightarrow^* t \}$.
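
A minimal sketch of membership in T(G) follows, assuming the grammar is given in normal form, where every right-hand side is a symbol applied to nonterminals. The dictionary encoding and the name `derives` are assumptions made for illustration; the general right-hand sides allowed by Definition 2.1 are not handled.

```python
def derives(productions, nonterminal, tree):
    """Return True if `nonterminal` derives `tree` in a normal-form grammar.

    productions maps a nonterminal to a list of (symbol, child_nonterminals).
    A tree is a tuple (symbol, subtree_1, ..., subtree_n).
    """
    symbol, *children = tree
    for sym, kids in productions.get(nonterminal, []):
        if sym == symbol and len(kids) == len(children):
            if all(derives(productions, n, c) for n, c in zip(kids, children)):
                return True
    return False


# The grammar for X = a ∪ f(X) from the introduction.
G_X = {"X": [("a", ()), ("f", ("X",))]}
in_forest = derives(G_X, "X", ("f", ("f", ("a",))))
not_in_forest = derives(G_X, "X", ("b",))
```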

Regular ΣX-grammars are a class of context-free grammars that define
the same family of forests as those recognized by nondeterministic
root-to-frontier (NDR) ΣX-automata. A root-to-frontier automaton can be viewed
as an attribute evaluator for a tree whose attributes are states prescribed
by an attribute grammar with inherited attributes only. Formally, an NDR
ΣX-automaton $\mathbf{A}$ is a tuple $(\mathcal{A}, A', \alpha)$ such that

1. $\mathcal{A}$ is a finite NDR Σ-algebra $(A, \Sigma)$,

2. $A' \subseteq A$ is a set of initial states, and

3. $\alpha : X \to \wp A$ is a final assignment.

In an NDR Σ-algebra $(A, \Sigma)$, $A$ is a nonempty set of states and every
$\sigma \in \Sigma_m$ with $m \geq 1$ is realized as a mapping
$\sigma^{\mathcal{A}} : A \to \wp(A^m)$. For $\sigma \in \Sigma_0$,
$\sigma^{\mathcal{A}}$ is a subset of $A$.

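For example, an NDR ΣX-automaton $\mathbf{A} = (\mathcal{A}, A', \alpha)$
recognizing the set

    $\{\sigma(x, y), \sigma(y, x)\}$

can be defined as follows. Let $\Sigma = \Sigma_2 = \{\sigma\}$, $X = \{x, y\}$,
and the set of initial states $A' = \{s\}$. Define
$\mathcal{A} = (\{x, y, s\}, \Sigma)$ such that

    $\sigma^{\mathcal{A}}(s) = \{(x, y), (y, x)\}$, $\sigma^{\mathcal{A}}(x) = \sigma^{\mathcal{A}}(y) = \emptyset$

and finally define the final assignment $\alpha$ as

    $x\alpha = \{y\}$
    $y\alpha = \{x\}$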
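
The run of such an automaton can be sketched in Python as below. The transition table is an assumption chosen to fit the final assignment given above: state s under σ may send the pair (x, y) or (y, x) to the two subtrees, and the other states have no transitions. The names `accepts_from`, `accepts`, `delta` and `alpha` are hypothetical, and leaves are assumed to carry frontier letters only.

```python
def accepts_from(state, tree, delta, alpha):
    """Root-to-frontier run: can `tree` be accepted when entered in `state`?"""
    symbol, *children = tree
    if not children:
        # At a leaf labelled by a frontier letter, consult the final assignment.
        return state in alpha.get(symbol, set())
    return any(
        len(states) == len(children)
        and all(accepts_from(s, c, delta, alpha) for s, c in zip(states, children))
        for states in delta.get((symbol, state), set())
    )


def accepts(tree, initial, delta, alpha):
    """Accept if some initial state leads to an accepting run."""
    return any(accepts_from(s, tree, delta, alpha) for s in initial)


initial = {"s"}
delta = {("sigma", "s"): {("x", "y"), ("y", "x")}}  # assumed transitions
alpha = {"x": {"y"}, "y": {"x"}}                    # final assignment

ok_xy = accepts(("sigma", ("x",), ("y",)), initial, delta, alpha)
ok_yx = accepts(("sigma", ("y",), ("x",)), initial, delta, alpha)
ok_xx = accepts(("sigma", ("x",), ("x",)), initial, delta, alpha)
ok_yy = accepts(("sigma", ("y",), ("y",)), initial, delta, alpha)
```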

It is interesting to note that there is no deterministic root-to-frontier
ΣX-automaton that accepts the set above. Suppose automaton $\mathbf{A}$ accepts
$\sigma(x, y)$ and $\sigma(y, x)$ and that $\sigma(a) = (a_1, a_2)$ for some
states $a$, $a_1$, and $a_2$ of $\mathbf{A}$. If $\alpha$ is $\mathbf{A}$'s
final assignment function, then

    $x\alpha = a_1$, $y\alpha = a_2$, $y\alpha = a_1$, $x\alpha = a_2$.

Since $\mathbf{A}$ is deterministic, $a_1 = a_2$. So we have
$\sigma(a) = (a_1, a_1)$ where $x\alpha = y\alpha = a_1$. Therefore on
$\sigma(x, x)$ and $\sigma(y, y)$, $\mathbf{A}$ enters the leaves in state
$a_1$ such that $a_1 \in x\alpha$ and $a_1 \in y\alpha$. Thus $\mathbf{A}$
accepts $\sigma(x, x)$ and $\sigma(y, y)$ as well.

Given that regular ΣX-grammars define exactly the forests recognized by
NDR ΣX-automata, one could formulate RF-INT in terms of the latter
representation of regular forests. But we choose regular ΣX-grammars instead
since they are better suited for manipulation.

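Regular forests are effectively closed under intersection.

Theorem 2.1 If $G_1$ and $G_2$ are regular ΣX-grammars, for a given $\Sigma$
and $X$, then $T(G_1) \cap T(G_2)$ is a forest generated by a regular
ΣX-grammar.

Proof. Suppose $G_1 = (N_1, \Sigma, X, P_1, S_1)$ and
$G_2 = (N_2, \Sigma, X, P_2, S_2)$ are regular ΣX-grammars, which we may assume
to be in normal form, so that every right-hand side is either
$\sigma(Y_1, \ldots, Y_n)$ with $\sigma \in \Sigma$ and nonterminals $Y_k$, or
a single $x \in X$. Let ΣX-grammar
$G = (N_1 \times N_2, \Sigma, X, P, [S_1, S_2])$ where

    $[A, B] \to \sigma([Y_1, Z_1], \ldots, [Y_n, Z_n]) \in P$, for $n \geq 0$,

if and only if

    $A \to \sigma(Y_1, \ldots, Y_n) \in P_1$,
    $B \to \sigma(Z_1, \ldots, Z_n) \in P_2$,

and $\sigma \in \Sigma$, and $[A, B] \to x \in P$ if and only if
$A \to x \in P_1$, $B \to x \in P_2$, and $x \in X$. Then
$T(G) = T(G_1) \cap T(G_2)$. □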
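
The product construction in the proof can be sketched in Python as follows, assuming both grammars are in normal form and encoded as dictionaries from nonterminals to lists of (symbol, child nonterminals). The encoding and the name `intersect` are assumptions for illustration; frontier letters are treated like nullary symbols.

```python
def intersect(p1, s1, p2, s2):
    """Pair up productions of two normal-form grammars that agree on the symbol."""
    prods = {}
    for A, rules1 in p1.items():
        for B, rules2 in p2.items():
            prods[(A, B)] = [
                (sym1, tuple(zip(kids1, kids2)))
                for sym1, kids1 in rules1
                for sym2, kids2 in rules2
                if sym1 == sym2 and len(kids1) == len(kids2)
            ]
    return prods, (s1, s2)


# Grammars for X = a ∪ f(X) and Y = b ∪ f(Y) from the introduction.
G_X = {"X": [("a", ()), ("f", ("X",))]}
G_Y = {"Y": [("b", ()), ("f", ("Y",))]}
G_Z, start_Z = intersect(G_X, "X", G_Y, "Y")
```

In this example the only surviving production is [X, Y] → f([X, Y]), which generates no finite tree, in agreement with Z being empty.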

The theorem implies that the family of regular forests is properly con- 
tained within the context-free languages since the latter is not closed under 
intersection. 

We now state and prove the main result. 

Theorem 2.2 RF-INT is PSPACE-hard. 

Proof. The proof uses a result of [Koz77]. For every deterministic Turing
machine $M$ of polynomial space complexity, we give a log-space transducer that
on input $x$, outputs a sequence of regular ΣX-grammars whose intersection
is nonempty iff $M$ accepts $x$.

Let $M$ be a single-tape DTM of polynomial space complexity $p(n) \geq n$
and assume that $M$ always makes an odd number of moves, at least three, has a
unique accepting state, $q_{acc}$, and erases its tape before accepting,
positioning its tape head at the left end of the tape. Let $x = a_1 \ldots a_n$
be a string over $M$'s input alphabet and suppose $M$ has states $Q$ and tape
symbols $\Gamma$ such that $Q$, $\Gamma$, and the set $\{nil, \#, \#\#\}$ are
pairwise disjoint. If

    $\Delta = \Gamma \cup \{[qX] \mid q \in Q \text{ and } X \in \Gamma\}$

then the ranked alphabet is
$\Sigma = \Sigma_0 \cup \Sigma_1 \cup \Sigma_2 \cup \Sigma_3$ where
$\Sigma_0 = \{nil\}$, $\Sigma_1 = \Delta$, $\Sigma_2 = \{\#\#\}$ and
$\Sigma_3 = \{\#\}$. Suppose $ID_\Delta$ derives the regular forest

    $Z_1(Z_2(\cdots Z_{p(n)}(nil) \cdots))$

for all $Z_k \in \Delta$, $1 \leq k \leq p(n)$, and $ID_i^{[X_1X_2X_3]}$ derives
the regular forest

    $Z_1(\cdots Z_{i-1}(X_1(X_2(X_3(Z_i(\cdots Z_{p(n)-3}(nil) \cdots))))) \cdots)$

for all $X_1, X_2, X_3, Z_k \in \Delta$, $1 \leq k \leq p(n) - 3$.

A computation of $M$ consists of a sequence of instantaneous descriptions
$ID_0 \vdash ID_1 \vdash \cdots \vdash ID_{2m+1}$, each containing the contents
of $M$'s tape padded with blanks ($B$'s) to length $p(n)$. If according to a
move of $M$, symbols $Y_1Y_2Y_3$ in positions $i$, $i+1$, and $i+2$
respectively of an ID can follow from symbols $X_1X_2X_3$ in the same positions
of another ID, we write

    $[X_1X_2X_3] \vdash_M [Y_1Y_2Y_3]$.

We give two regular ΣX-grammars $F_i^{odd}$ and $F_i^{even}$ such that
$F_i^{odd}$ ensures that even ID's follow from odd ones, and $F_i^{even}$ that
odd ones follow from even ones. Let $F_i^{odd}$ be a regular ΣX-grammar with
empty frontier alphabet, start symbol $S$ and productions

    $S \to \#(ID_\Delta, ID_i^{[Z_1Z_2Z_3]}, F^{[Z_1Z_2Z_3]})$

for all $Z_k \in \Delta$, $1 \leq k \leq 3$,

    $F^{[X_1X_2X_3]} \to \#(ID_i^{[Y_1Y_2Y_3]}, ID_i^{[Z_1Z_2Z_3]}, F^{[Z_1Z_2Z_3]})$

for all $X_k, Y_k, Z_k \in \Delta$, $1 \leq k \leq 3$, such that
$[X_1X_2X_3] \vdash_M [Y_1Y_2Y_3]$, and

    $F^{[X_1X_2X_3]} \to \#\#(ID_i^{[Y_1Y_2Y_3]}, ID_\Delta)$

for all $X_k, Y_k \in \Delta$, $1 \leq k \leq 3$, such that
$[X_1X_2X_3] \vdash_M [Y_1Y_2Y_3]$.

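Let $F_i^{even}$ be a regular ΣX-grammar with empty frontier alphabet, start
symbol $S$ and productions

    $S \to \#(ID_i^{[X_1X_2X_3]}, ID_i^{[Y_1Y_2Y_3]}, S)$
    $S \to \#\#(ID_i^{[X_1X_2X_3]}, ID_i^{[Y_1Y_2Y_3]})$

for all $X_k, Y_k \in \Delta$, $1 \leq k \leq 3$, such that
$[X_1X_2X_3] \vdash_M [Y_1Y_2Y_3]$.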

Finally, suppose $initID$ derives the unary tree

    $[q_0a_1](a_2(\cdots a_n(B_{n+1}(\cdots B_{p(n)}(nil) \cdots)) \cdots))$

where each $B_k$ is a blank and $q_0$ is the start state of $M$, and $finalID$
derives

    $[q_{acc}B](B_2(\cdots B_{p(n)}(nil) \cdots))$.

Then let $F^{end}$ be a regular grammar with start symbol $S$ and productions

    $S \to \#(initID, ID_\Delta, F_{acc})$
    $F_{acc} \to \#(ID_\Delta, ID_\Delta, F_{acc})$
    $F_{acc} \to \#\#(ID_\Delta, finalID)$

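Then we have

    $u \in \bigcap_{i=1}^{p(n)-2} T(F_i^{odd})$

iff $u = \#(ID_0, ID_1, \#(\cdots \#(ID_{2m-2}, ID_{2m-1}, \#\#(ID_{2m}, ID_{2m+1})) \cdots))$
and from $ID_{2k-1}$ follows $ID_{2k}$ according to the transition rules of
$M$ for $1 \leq k \leq m$. Likewise,

    $u \in \bigcap_{i=1}^{p(n)-2} T(F_i^{even})$

iff $u = \#(ID_0, ID_1, \#(\cdots \#(ID_{2m-2}, ID_{2m-1}, \#\#(ID_{2m}, ID_{2m+1})) \cdots))$
and from $ID_{2k}$ follows $ID_{2k+1}$ according to the rules of $M$ for
$0 \leq k \leq m$. Then

    $T(F^{end}) \cap \bigcap_{i=1}^{p(n)-2} \left( T(F_i^{odd}) \cap T(F_i^{even}) \right)$

is nonempty iff $M$ accepts $x$. □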

As is the case for emptiness of intersection of a sequence of DFA’s, the 
source for the hardness of RF-INT lies not in deciding emptiness but rather 
in computing the intersection of regular forests. 

Corollary 2.3 Given regular ΣX-grammars $G_1, \ldots, G_m$, constructing a
regular ΣX-grammar $G$ such that $T(G) = \bigcap_{k=1}^{m} T(G_k)$ is
PSPACE-hard.

Proof. The emptiness of $T(G)$ for a regular ΣX-grammar $G$ is decidable
in time $O(|G|^2)$ in the usual way. From the proof of Theorem 2.2 then
every problem in PSPACE is P-time Turing reducible to the problem of
constructing the intersection of a sequence of regular ΣX-grammars. □
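
One common way to test emptiness, sketched here under the assumption of a normal-form grammar in a dictionary encoding, is to compute the productive nonterminals as a least fixed point. The name `is_empty` and the encoding are illustrative assumptions. Each pass scans every production and there are at most as many passes as nonterminals, so this simple loop stays polynomial in the size of the grammar.

```python
def is_empty(productions, start):
    """Return True if the start symbol derives no finite tree.

    productions maps a nonterminal to a list of (symbol, child_nonterminals).
    """
    productive = set()
    changed = True
    while changed:
        changed = False
        for nt, rules in productions.items():
            if nt in productive:
                continue
            # A nonterminal is productive once some rule has only productive children.
            if any(all(k in productive for k in kids) for _, kids in rules):
                productive.add(nt)
                changed = True
    return start not in productive


nonempty_case = is_empty({"X": [("a", ()), ("f", ("X",))]}, "X")
empty_case = is_empty({("X", "Y"): [("f", (("X", "Y"),))]}, ("X", "Y"))
```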

A simple algorithm for constructing $G$ is based on the usual construction
of forming the cartesian product of reachable states as is suggested in the
proof of Theorem 2.1 [AiM91]. It has worst-case time complexity exponential
in $m$. Unfortunately this naive construction is likely the best we can do. It
should be pointed out that for a fixed $m$, constructing $G$ from
$G_1, \ldots, G_m$ can be done in polynomial time.
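
A minimal sketch of the reachable-product construction for m grammars is given below, again assuming normal-form grammars in a dictionary encoding. The name `reachable_product` and the worklist strategy are assumptions for illustration, not the algorithm of [AiM91]. Only tuples of nonterminals reachable from the tuple of start symbols are built, but their number can still grow as the product of the sizes of the nonterminal sets.

```python
from itertools import product


def reachable_product(grammars):
    """Build the m-fold product grammar restricted to reachable tuples.

    grammars is a list of (productions, start) pairs in normal form.
    """
    start = tuple(s for _, s in grammars)
    prods, todo = {}, [start]
    while todo:
        state = todo.pop()
        if state in prods:
            continue
        choices = [p.get(nt, []) for (p, _), nt in zip(grammars, state)]
        rules = []
        for combo in product(*choices):
            symbols = {sym for sym, _ in combo}
            arities = {len(kids) for _, kids in combo}
            if len(symbols) == 1 and len(arities) == 1:
                # Child j of the product rule pairs up the j-th children.
                kids = tuple(zip(*(k for _, k in combo)))
                rules.append((combo[0][0], kids))
                todo.extend(kids)
        prods[state] = rules
    return prods, start


G1 = {"X": [("a", ()), ("f", ("X",))]}
G2 = {"Y": [("b", ()), ("f", ("Y",))]}
G3 = {"W": [("a", ()), ("b", ()), ("f", ("W",))]}

prods13, start13 = reachable_product([(G1, "X"), (G3, "W")])
prods123, start123 = reachable_product([(G1, "X"), (G2, "Y"), (G3, "W")])
```

For a fixed m the outer loop visits polynomially many tuples, which matches the remark about fixed m above.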

Deciding whether some number of DFA’s accept a common string can be 
done in nondeterministic linear space, but this does not appear to be true 
for RF-INT, which can be decided in deterministic exponential time. This 
suggests that a tighter lower bound exists for RF-INT. 